Our method of rendering impostor spheres is very similar to our method of rendering mesh spheres. In both cases, we set uniforms that define the sphere's position and radius. We bind a material uniform buffer, then bind a VAO and execute a draw command. We do this for each sphere.
However, this seems rather wasteful for impostors. Our per-vertex data for the impostor is really just the position and the radius. If we could somehow send this data 4 times, once for each corner of the square, then we could simply put all of our position and radius values in a buffer object and render every sphere in one draw call. Of course, we would also need to find a way to tell each sphere which material to use.
We accomplish this task in the Geometry Impostor tutorial project. It looks exactly the same as before; it always draws impostors, using the depth-accurate shader.
To see how this works, we will start from the front of the rendering pipeline and follow the data. This begins with the buffer object and vertex array object we use to render.
Example 13.5. Impostor Geometry Creation
glBindBuffer(GL_ARRAY_BUFFER, g_imposterVBO);
glBufferData(GL_ARRAY_BUFFER, NUMBER_OF_SPHERES * 4 * sizeof(float), NULL, GL_STREAM_DRAW);

glGenVertexArrays(1, &g_imposterVAO);
glBindVertexArray(g_imposterVAO);
glEnableVertexAttribArray(0);
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 4 * sizeof(float), (void*)(0));
glEnableVertexAttribArray(1);
glVertexAttribPointer(1, 1, GL_FLOAT, GL_FALSE, 4 * sizeof(float), (void*)(12));

glBindVertexArray(0);
glBindBuffer(GL_ARRAY_BUFFER, 0);
This code introduces us to a new feature of glVertexAttribPointer. In all prior cases, the fifth parameter was 0. Now it is 4 * sizeof(float). What does this parameter mean?
This parameter is the array's stride. It is the number of bytes from one value for this attribute to the next in the buffer. When this parameter is 0, the actual stride is the size of the base type (GL_FLOAT in our case) times the number of components. When the stride is non-zero, it is used as given; for interleaved data like ours, it will be at least that large, since the attributes of consecutive vertices would otherwise overlap.
What this means for our vertex data is that the first 3 floats represent attribute 0, and the next float represents attribute 1. The next 3 floats are attribute 0 of the next vertex, and the float after that is attribute 1 of that vertex. And so on.
Arranging attributes of the same vertex alongside one another is called interleaving. It is a very useful technique; indeed, for performance reasons, data should generally be interleaved where possible. One thing that it allows us to do is build our vertex data based on a struct:
struct VertexData
{
    glm::vec3 cameraPosition;
    float sphereRadius;
};
Our vertex array object perfectly describes the arrangement of data in an array of VertexData objects. So when we upload our positions and radii to the buffer object, we simply create an array of these structs, fill in the values, and upload them with glBufferData.
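A minimal sketch of such an upload, assuming hypothetical helpers GetSpherePos and GetSphereRadius for fetching each sphere's camera-space position and radius (the tutorial's actual code derives these from its scene state):

std::vector<VertexData> sphereData(NUMBER_OF_SPHERES);
for(int i = 0; i < NUMBER_OF_SPHERES; ++i)
{
    sphereData[i].cameraPosition = GetSpherePos(i);    //hypothetical helper
    sphereData[i].sphereRadius = GetSphereRadius(i);   //hypothetical helper
}

glBindBuffer(GL_ARRAY_BUFFER, g_imposterVBO);
//Fill the buffer; the VAO's stride and offsets match sizeof(VertexData)
//and the layout of its members.
glBufferData(GL_ARRAY_BUFFER, sphereData.size() * sizeof(VertexData),
    sphereData.data(), GL_STREAM_DRAW);
glBindBuffer(GL_ARRAY_BUFFER, 0);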
So, our vertex data now consists of a position and a radius. But we need to draw four vertices, not one. How do we do that?
We could replicate each vertex's data 4 times and use some simple gl_VertexID math in the vertex shader to figure out which corner we're using. Or we could get complicated and learn something new. That new thing is an entirely new programmatic shader stage: geometry shaders.
Our initial pipeline discussion ignored this shader stage, because it is an entirely optional part of the pipeline. If a program object does not contain a geometry shader, then OpenGL just does its normal stuff.
The most confusing thing about geometry shaders is that they do not shade geometry. Vertex shaders take a vertex as input and write a vertex as output. Fragment shaders take a fragment as input and potentially write a fragment as output. Geometry shaders take a primitive as input and write zero or more primitives as output. By all rights, they should be called “primitive shaders.”
In any case, geometry shaders are invoked just after the hardware collects vertex shader outputs into a primitive, but before any clipping, transforming, or rasterization happens. A geometry shader gets the values output from multiple vertex shader invocations, performs arbitrary computations on them, and outputs one or more sets of values to new primitives.
In our case, the logic begins with our drawing call:
glBindVertexArray(g_imposterVAO);
glDrawArrays(GL_POINTS, 0, NUMBER_OF_SPHERES);
glBindVertexArray(0);
This introduces a completely new primitive and primitive type: GL_POINTS.
Recall that multiple primitives can have the same base type. GL_TRIANGLE_STRIP and GL_TRIANGLES are both separate primitives, but both generate triangles. GL_POINTS does not generate triangle primitives; it generates point primitives.

GL_POINTS interprets each individual vertex as a separate point primitive. There are no other forms of point primitives, because points only contain a single vertex's worth of information.
The vertex shader is quite simple, but it does have some new things to show us:
Example 13.6. Vertex Shader for Points
#version 330

layout(location = 0) in vec3 cameraSpherePos;
layout(location = 1) in float sphereRadius;

out VertexData
{
    vec3 cameraSpherePos;
    float sphereRadius;
} outData;

void main()
{
    outData.cameraSpherePos = cameraSpherePos;
    outData.sphereRadius = sphereRadius;
}
VertexData is not a struct definition, though it does look like one. It is an interface block definition. Uniform blocks are a kind of interface block, but inputs and outputs can also have interface blocks.
An interface block used for inputs and outputs is a way of collecting them into groups. One of the main uses for these is to separate the namespaces of inputs and outputs using the interface name (outData, in this case). This allows us to use the same names for inputs as we do for their corresponding outputs. They do have other virtues, as we will soon see.
Do note that this vertex shader does not write to gl_Position. That is not necessary when a vertex shader is paired with a geometry shader.
Speaking of which, let's look at the global definitions of our geometry shader.
Example 13.7. Geometry Shader Definitions
#version 330
#extension GL_EXT_gpu_shader4 : enable

layout(std140) uniform;
layout(points) in;
layout(triangle_strip, max_vertices=4) out;

uniform Projection
{
    mat4 cameraToClipMatrix;
};

in VertexData
{
    vec3 cameraSpherePos;
    float sphereRadius;
} vert[];

out FragData
{
    flat vec3 cameraSpherePos;
    flat float sphereRadius;
    smooth vec2 mapping;
};
The #extension line exists to work around a compiler bug in NVIDIA's OpenGL implementation. It should not be necessary.
We see some new uses of the layout directive. The layout(points) in directive is geometry shader-specific. It tells OpenGL that this geometry shader is intended to take point primitives. This declaration is required, and OpenGL will fail to render if you try to draw something other than GL_POINTS through this geometry shader.
Similarly, the output layout declaration states that this geometry shader outputs triangle strips. The max_vertices directive states that we will write at most 4 vertices. There are implementation-defined limits on how large max_vertices can be. Both of these declarations are required for geometry shaders.
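If you need to know the limit, you can query it; a quick sketch:

//Query the implementation-defined ceiling on max_vertices.
GLint maxOutputVertices = 0;
glGetIntegerv(GL_MAX_GEOMETRY_OUTPUT_VERTICES, &maxOutputVertices);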
Below the Projection uniform block, we have two interface blocks. The first one matches the definition from the vertex shader, with two exceptions: it has a different interface name, and that interface name has an array qualifier on it.
Geometry shaders take a primitive. And a primitive is defined as some number of vertices in a particular order. The input interface blocks define what the input vertex data is, but there is more than one set of vertex data. Therefore, the interface blocks must be defined as arrays. Granted, in our case, it is an array of length 1, since point primitives have only one vertex. But this is still necessary even in that case.
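For comparison (this is not part of the tutorial's code), a geometry shader that took triangles would declare the same block with three elements:

layout(triangles) in;

in VertexData
{
    vec3 cameraSpherePos;
    float sphereRadius;
} vert[];    //implicitly sized to 3, one entry per triangle vertex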
We also have another output interface block, FragData. This one matches the corresponding input block in the fragment shader, as we will see a bit later. It does not have an instance name.

Also, note that several of the values use the flat qualifier. We could have just used smooth, since we're passing the same values for all of the triangles. However, it's more descriptive to use the flat qualifier for values that are not supposed to be interpolated. It might even save performance.
Here is the geometry shader code for computing one of the vertices of the output triangle strip:
Example 13.8. Geometry Shader Vertex Computation
//Bottom-left
mapping = vec2(-1.0, -1.0) * g_boxCorrection;
cameraSpherePos = vec3(vert[0].cameraSpherePos);
sphereRadius = vert[0].sphereRadius;

cameraCornerPos = vec4(vert[0].cameraSpherePos, 1.0);
cameraCornerPos.xy += vec2(-vert[0].sphereRadius, -vert[0].sphereRadius) * g_boxCorrection;
gl_Position = cameraToClipMatrix * cameraCornerPos;
gl_PrimitiveID = gl_PrimitiveIDIn;
EmitVertex();
This code is followed by three more blocks like it, using different mapping and offset values for the different corners of the square. The cameraCornerPos is a local variable that is re-used as temporary storage.
To output a vertex, write to each of the output variables. In this case, we have the three from the output interface block, as well as the built-in variables gl_Position and gl_PrimitiveID (which we will discuss more in a bit). Then, call EmitVertex(); this causes all of the values in the output variables to be collected into a vertex that is sent to the output primitive. After calling this function, the contents of those outputs are undefined. So if you want to use the same value for multiple vertices, you have to store the value in a different variable or recompute it.
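To make the overall structure concrete, here is a sketch of how the full main() might be written as a loop over the four corners. This is not the tutorial's verbatim code; it assumes g_boxCorrection is a constant padding factor that enlarges the quad so the sphere always fits inside it, and the corner order is chosen to produce a correctly wound triangle strip:

const float g_boxCorrection = 1.5;    //assumed padding factor

void main()
{
    //Corner order: bottom-left, top-left, bottom-right, top-right,
    //which a triangle strip turns into two triangles.
    const vec2 corners[4] = vec2[4](
        vec2(-1.0, -1.0), vec2(-1.0, 1.0),
        vec2(1.0, -1.0), vec2(1.0, 1.0));

    for(int i = 0; i < 4; i++)
    {
        //Re-write every output for each vertex; their contents are
        //undefined after EmitVertex().
        mapping = corners[i] * g_boxCorrection;
        cameraSpherePos = vert[0].cameraSpherePos;
        sphereRadius = vert[0].sphereRadius;

        vec4 cameraCornerPos = vec4(vert[0].cameraSpherePos, 1.0);
        cameraCornerPos.xy += corners[i] * vert[0].sphereRadius * g_boxCorrection;
        gl_Position = cameraToClipMatrix * cameraCornerPos;
        gl_PrimitiveID = gl_PrimitiveIDIn;
        EmitVertex();
    }
    //The strip is ended automatically when the shader exits.
}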
Note that clipping, face-culling, and all of that stuff happens after the geometry shader. This means that we must ensure that the order of our output positions will be correct given the current winding order.
gl_PrimitiveIDIn is a special input value. Much like gl_VertexID from the vertex shader, gl_PrimitiveIDIn represents the current primitive being processed by the geometry shader (one more reason for calling it a primitive shader). We write this to the built-in output gl_PrimitiveID, so that the fragment shader can use it to select which material to use.
And speaking of the fragment shader, it's time to have a look at that.
Example 13.9. Fragment Shader Changes
in FragData
{
    flat vec3 cameraSpherePos;
    flat float sphereRadius;
    smooth vec2 mapping;
};

out vec4 outputColor;

layout(std140) uniform;

struct MaterialEntry
{
    vec4 diffuseColor;
    vec4 specularColor;
    vec4 specularShininess;    //ATI Array Bug fix. Not really a vec4.
};

const int NUMBER_OF_SPHERES = 4;

uniform Material
{
    MaterialEntry material[NUMBER_OF_SPHERES];
} Mtl;
The input interface is just the mirror of the output from the geometry shader. What's more interesting is what happened to our material blocks.
In our original code, we had an array of uniform blocks stored in a single uniform buffer in C++. We bound specific portions of this material block when we wanted to render with a particular material. That will not work now that we are trying to render multiple spheres in a single draw call.
So, instead of having an array of uniform blocks, we have a uniform block that contains an array. We bind all of the materials to the shader, and let the shader pick which one it wants as needed. The source code to do this is pretty straightforward.
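As a sketch (the buffer and binding-index names are assumptions, not the tutorial's exact identifiers), binding the whole array looks like this:

struct MaterialEntry    //mirrors the std140 layout: three vec4s, 48 bytes
{
    glm::vec4 diffuseColor;
    glm::vec4 specularColor;
    glm::vec4 specularShininess;    //only .x is meaningful
};

//Bind the entire material array to the Material block's binding index,
//rather than one MaterialEntry-sized range per draw call.
glBindBufferRange(GL_UNIFORM_BUFFER, g_materialBlockIndex, g_materialUniformBuffer,
    0, sizeof(MaterialEntry) * NUMBER_OF_SPHERES);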
Notice that the material specularShininess became a vec4 instead of a simple float. This is due to an unfortunate bug in ATI's OpenGL implementation.
As for how the material selection happens, that's simple. In our case, we use the primitive identifier. The gl_PrimitiveID value written by the geometry shader is used to index into the Mtl.material[] array.
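In shader terms, the selection is a single array index. A sketch, where ComputeLighting stands in for the depth-accurate impostor lighting computation developed earlier in the chapter:

void main()
{
    //Select this sphere's material; gl_PrimitiveID was forwarded from
    //gl_PrimitiveIDIn by the geometry shader.
    MaterialEntry mtl = Mtl.material[gl_PrimitiveID];

    outputColor = ComputeLighting(mtl);    //stand-in for the earlier lighting code
}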
Do note that uniform blocks have a maximum size that is hardware-dependent. If we wanted to have a large palette of materials, on the order of several thousand, then we may exceed this limit. At that point, we would need an entirely new way to handle this data, one that we have not learned about yet.
Or we could just split it up into multiple draw calls instead of one.